ABSTRACT


Schaffner's logic of comparative theory evaluation is criticised
for an inappropriate analysis of ad hocness. An alternative 
analysis, based on Zahar's account of novelty, is given and 
extended to the case of multiple successful predictions by a 
theory. The application of the method to quantitative predictions 
is discussed. 


Ad Hocness and the Appraisal of Theories. * 


1. The Bayesian Analysis of Ad Hocness. 


In a recent note Schaffner ([1974]) has given a formal discussion
of the notion of ad hocness in terms of a Bayesian model for the
appraisal of theories. Schaffner develops his general ideas in
the context of a critique of Zahar's [1973], which was concerned
with the particular problem of comparing the Einstein and Lorentz
research programmes. Zahar suggests the following analysis of
ad hocness:¹
    Ad hoc₁: A theory is said to be ad hoc₁ if it has no
    novel consequences as compared with its predecessor.

    Ad hoc₂: [A theory] ... is ad hoc₂ if none of its novel
    predictions have been actually 'verified'.

    Ad hoc₃: [A] ... theory is said to be ad hoc₃ if it is
    obtained from its predecessor through a modification of
    the auxiliary hypotheses which does not accord with the
    spirit of the heuristic of the programme.

Zahar explains the meaning of novelty as follows:²

    A fact will be considered novel with respect to a given
    hypothesis if it did not belong to the problem-situation
    which governed the construction of the hypothesis.


Schaffner begins his elucidation by discussing the notions of
ad hoc₁ and ad hoc₃. The first he describes as a logical dream,
since the novel consequences of a theory cannot in practice be
"surveyed", so the question of whether a theory is ad hoc₁ can
only be discussed relative to the extent to which novel
consequences have been looked for at the particular epoch of
the evaluation. Ad hoc₃ Schaffner claims to be "vague to the
point of inapplicability". For Schaffner ad hoc₂ is "close to
the sense in which ad hoc is used in science" and he announces
that his Bayesian analysis will be brought to bear on this
sense of ad hoc.

* I am grateful to Donald Gillies, Jon Dorling and Noretta Koertge
for helpful comments and suggestions. The present work formed part
of a paper read in Professor Watkins' seminar at The London School
of Economics in January 1976.


Denoting by $p(T/b\&e)$ the probability that a theory T is true
in the light of background knowledge b and the positive outcome
e of some experiment not part of b, by $p(T/b)$ the prior
assessment of T, by $p(e/T\&b)$ the probability of obtaining e
given T and b, and by $p(e/b)$ the probability of obtaining e on
the basis of background knowledge alone, Schaffner writes

\[ p(T/b\&e) = \frac{p(T/b)\, p(e/T\&b)}{p(e/b)} \qquad (1) \]

If T explains e we can set $p(e/T\&b) = 1$, so in this case we
obtain

\[ p(T/b\&e) = \frac{p(T/b)}{p(e/b)} \qquad (2) \]
Schaffner proceeds to discuss ad hocness as a property of an
hypothesis as a constituent part of a theory, but in order
to keep the argument as simple as possible we shall follow
Zahar in considering the ad hocness of theories. Translated
into these terms Schaffner's idea is that a theory T gives
an ad hoc explanation of an experimental result e if $p(T/b)$
is close to zero and $p(e/b)$ is significantly larger than
$p(T/b)$. The argument for $p(T/b)$ being small is that T
has no "theoretical support" (or indeed empirical support
other than e itself). This looks suspiciously like a
reference to ad hoc₃, and Schaffner now appears to be claiming
that a theory is ad hoc₂ partly in virtue of its being ad hoc₃.
So he is not really giving an independent analysis of ad hoc₂
at all.


The lack of novelty in the prediction of e is associated by
Schaffner with a high value for $p(e/b)$. On Zahar's account
this is a necessary condition for lack of novelty, but not a
sufficient condition. We proceed to show how, using Zahar's notion
of novelty, one can get an "internal" Bayesian analysis of ad hoc₂.

We express Bayes's theorem in the following familiar way

\[ p(T/b\&e) = \frac{p(T/b)\, p(e/T\&b)}{p(e/T\&b)\, p(T/b) + p(e/{\sim}T\&b)\, p({\sim}T/b)} \qquad (3) \]

where $\sim T$ denotes the negation of T.


We follow Schaffner in interpreting $p(A/B)$ as the degree of
belief that A is true given that B is true. Note also that e
is supposed in equation (3) to refer to a prediction made by
the theory T (we could perhaps write $e_T$ to emphasize this
important point), so $p(e/{\sim}T\&b)$ means the probability that
the prediction $e_T$ derived from the theory T is a true prediction
given that the background information b is true but the
theory T is false (i.e. its consequences are not guaranteed to
be true although "by accident" they may be true).


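Writing $p(T/b) = x$ and $p(e/{\sim}T\&b) = \varepsilon$, taking
$p(e/T\&b) = 1$ and using $p({\sim}T/b) = 1 - x$, we obtain

\[ p(T/b\&e) = \frac{x}{x + \varepsilon(1 - x)} \qquad (4) \]

We define an enhancement ratio $\gamma$ by

\[ \gamma = \frac{p(T/b\&e)}{p(T/b)} \]

whence using (4) we obtain the simple result

\[ \gamma = \frac{1}{x + \varepsilon(1 - x)} \qquad (5) \]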


We can now explain that if a theory T is ad hoc₂ with respect to
the experiment e then $\varepsilon = 1$, i.e. the explanation of e
by T in no way depends on the truth or falsehood of T, both of
which eventualities lead with certainty to the result e. This is
just what a scientist means when he says T was an ad hoc
explanation of e, namely T was devised for the express purpose of
explaining e, so the explanation of e is guaranteed independent of
whether T is true or false. To show the consistency of our
analysis, if we put $\varepsilon = 1$ in (5) we get $\gamma = 1$, so
the posterior and prior probabilities of T are equal (there is no
enhancement) and this again is just what we expect from an ad hoc
explanation of e, namely e itself gives us no additional
information for assessing the truth of T.³

Notice that $p(e/b) = x + \varepsilon(1 - x)$ is equal to unity if
$\varepsilon = 1$, but that $p(e/b) = 1$ is also achieved for
$x = 1$ whatever the value of $\varepsilon$, so the explications of
novelty in terms of low $p(e/b)$ (Schaffner) and small
$\varepsilon$ (Zahar) are by no means equivalent.
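
The behaviour of (4) and (5) can be checked numerically. The short Python sketch below is an illustration only and not part of the original argument; the function names are our own, `x` stands for the prior p(T/b) and `eps` for ε, the probability of e on the assumption that T is false. It shows that ε = 1 gives no enhancement, that a small ε with x much smaller than ε gives an enhancement close to 1/ε, and that p(e/b) = 1 can arise from x = 1 even when ε is small.

```python
def posterior(x: float, eps: float) -> float:
    """Equation (4): p(T/b&e) when p(e/T&b) = 1."""
    return x / (x + eps * (1.0 - x))


def enhancement(x: float, eps: float) -> float:
    """Equation (5): ratio of posterior to prior probability of T."""
    return 1.0 / (x + eps * (1.0 - x))


def prob_e(x: float, eps: float) -> float:
    """p(e/b) = x + eps * (1 - x), the denominator of (4)."""
    return x + eps * (1.0 - x)


# Ad hoc case: eps = 1, so e is certain whether or not T is true.
print("eps = 1:         gamma =", enhancement(0.01, 1.0))

# Novel prediction with x << eps << 1: gamma is close to 1/eps = 10.
print("x << eps << 1:   gamma =", enhancement(0.001, 0.1))

# p(e/b) = 1 is also reached for x = 1, whatever the value of eps.
print("x = 1, eps = .1: p(e/b) =", prob_e(1.0, 0.1))
```

The third line of output is the point made in the text: a high p(e/b) does not by itself indicate that ε is close to unity.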


On our analysis, if $x \ll \varepsilon \ll 1$ then
$\gamma \approx 1/\varepsilon$, so in this case we get a big
enhancement and the theory is far from being ad hoc. Noting that
under these conditions $p(e/b) \approx \varepsilon$, we see that
this is a situation in which Schaffner would claim that the theory
was ad hoc, which highlights the way in which his analysis differs
from ours. Effectively Schaffner requires $\gamma x$ to be small as
his condition of ad hocness. He is thus concerned with the
absolute value of the probability of a theory after it has
explained some experimental finding. If this absolute value
is still small the theory is to be regarded as an ad hoc
explanation of the experiment. On our account the important aspect
in assessing ad hocness for this case is not the absolute value of
the probability but the enhancement ratio. Our point against
Schaffner is not that his analysis may not explicate some
legitimate sense of the ambiguous appellation ad hoc, but that it
fails to explicate Zahar's very important notion of ad hoc₂. There
is no inconsistency here on Schaffner's part, since he finds
Zahar's account of ad hoc₂ inadequate in respect of the definition
of novelty involved, with its historical associations,
but we would maintain that Zahar's sense of ad hoc₂ is the one that
ought to be explicated since it is the one most importantly used
in science.


2. The Case of Multiple Predictions 


To develop our analysis a little further we can consider how a
theory builds up a favourable appraisal as it makes a number of
successful predictions $e_1, e_2, \ldots, e_n$, say. Denote by $p_n$
the posterior probability $p(T/b\&e_1\&e_2\ldots\&e_n)$ after n
successful predictions. Assuming for simplicity that successive
appraisals are all associated with the same value of
$\varepsilon$, we can clearly write

\[ p_n = \gamma^{(n)}\, \gamma^{(n-1)} \cdots \gamma^{(1)}\, p_0 \]

where $p_0 = x$ according to our previous notation and

\[ \gamma^{(s+1)} = \frac{1}{p_s(1 - \varepsilon) + \varepsilon}, \qquad p_{s+1} = \gamma^{(s+1)}\, p_s . \]


The solution of this recursion is by inspection, or more simply
by replacing $\varepsilon$ by $\varepsilon^n$ in the formula for
$p_1$ (see (4) above). We obtain⁴

\[ p_n = \frac{1}{1 + \varepsilon^n (1 - x)/x} \qquad (6) \]
We can also ask what is the probability for the (n+1)th
prediction $e_{n+1}$ being correct if the theory has already made n
successful predictions $e_1, \ldots, e_n$. Denoting this probability
by $p(e_{n+1})$ we clearly obtain the result

\[ p(e_{n+1}) = \varepsilon + \frac{1 - \varepsilon}{1 + \varepsilon^n (1 - x)/x} \qquad (7) \]
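
As a check on the closed form (6), the following Python sketch (an illustration under our own naming, not part of the original derivation) applies the one-step update n times and compares the result with (6). It also evaluates (7), written as ε + (1 − ε)pₙ. Here `x` is the prior p(T/b) and `eps` is ε, assumed to be the same for every prediction, as in the text.

```python
def closed_form(n: int, x: float, eps: float) -> float:
    """Equation (6): posterior of T after n successful predictions."""
    return 1.0 / (1.0 + eps**n * (1.0 - x) / x)


def iterate(n: int, x: float, eps: float) -> float:
    """Apply p_{s+1} = p_s / (p_s * (1 - eps) + eps) n times, from p_0 = x."""
    p = x
    for _ in range(n):
        p = p / (p * (1.0 - eps) + eps)
    return p


def next_success(n: int, x: float, eps: float) -> float:
    """Equation (7): probability that prediction n+1 is correct."""
    return eps + (1.0 - eps) * closed_form(n, x, eps)


x, eps = 0.01, 0.1
for n in range(6):
    difference = abs(iterate(n, x, eps) - closed_form(n, x, eps))
    print(n, closed_form(n, x, eps), next_success(n, x, eps), difference)
```

The last column should be zero up to rounding error. For n = 0 the value of (7) reduces to x + ε(1 − x), which is p(e/b) as it appears in the denominator of (4).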
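If $\varepsilon$ is a small quantity (i.e. $\varepsilon \ll 1$),
which will be the case if T is non-ad hoc₂ with respect to all the
predictions, we can write the following formulae, which will be
perfectly satisfactory for the subsequent discussion

\[ p_n \approx \frac{1}{1 + \varepsilon^n/x} \qquad (8) \]

\[ p(e_{n+1}) \approx \varepsilon + p_n \qquad (9) \]

\[ \gamma^{(n)} \approx \frac{1}{\varepsilon + p_{n-1}} \qquad (10) \]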


We note the following features

(1) The value of $p_n$ depends entirely on the ratio
    $\varepsilon^n/x$. For $\varepsilon^n/x \gg 1$, $p_n \ll 1$,
    and for $\varepsilon^n/x \ll 1$, $p_n \approx 1$.
    At the critical point $\varepsilon^n = x$ we have
    $p_n \approx 1/2$. So if initially $x \ll \varepsilon$, as n
    increases $p_n$ will rise steeply as we reach the value
    $n = \ln x / \ln \varepsilon$.

(2) So long as $p_n \ll \varepsilon$ we have
    $p(e_{n+1}) \approx \varepsilon$, but as $p_n$ builds
    up towards unity so does $p(e_{n+1})$.

(3) So long as $p_{n-1} \ll \varepsilon$,
    $\gamma^{(n)} \approx 1/\varepsilon$, but as $p_{n-1}$ builds
    up towards unity the enhancement factor $\gamma^{(n)}$ also
    tends to unity.


To take a concrete case we illustrate in the accompanying figure
$p_n$ and $p(e_{n+1})$ as functions of n for the particular choice
$x = 0.01$, $\varepsilon = 0.1$.
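
The values plotted for this choice of parameters can be regenerated with a few lines of Python. The sketch below is our own illustration (function and variable names are not from the original); it tabulates the posterior after n successes and the probability of the next success, using the exact forms (6) and (7), together with the critical value ln x / ln ε.

```python
import math

X = 0.01    # prior p(T/b), the value chosen in the text
EPS = 0.1   # probability of a successful prediction if T is false


def p_n(n: int) -> float:
    """Posterior of T after n successful predictions, equation (6)."""
    return X / (X + EPS**n * (1.0 - X))


def p_next(n: int) -> float:
    """Probability that prediction n+1 succeeds, equation (7)."""
    return EPS + (1.0 - EPS) * p_n(n)


def critical_n() -> float:
    """Value of n at which eps**n equals x."""
    return math.log(X) / math.log(EPS)


print(" n     p_n  p(e_n+1)")
for n in range(7):
    print(f"{n:2d}  {p_n(n):.4f}    {p_next(n):.4f}")
print("critical n =", round(critical_n(), 6))
```

With these parameters the critical value is n = 2, where the posterior is close to 1/2; it is about 0.09 one step earlier and about 0.91 one step later, which is the steep rise described in feature (1).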
3. Quantitative Predictions


We can apply this analysis to the important case of quantitative 
predictions. Suppose a theory T predicts correctly an 
experimental result which is known to an accuracy of n significant 
figures. We assume as part of our background knowledge that 

the order of magnitude of the result is known, i.e. we 

disregard the prediction of zeros occurring before or after the
significant figures in the experimental result. As a concrete 
illustration we cite the theoretical predictions by quantum 
electrodynamics of the anomaly in the hydrogen spectrum known as 
the Lamb shift and the anomaly in the magnetic moment of the 
electron. For example the latter is now known experimentally 

to be (1159652442) x 197° Bohr magnetons~ whereas the theoretical 
prediction is (1159652446) x i919 Bohr magnetons. “ We thus 
have remarkable agreement to seven significant figures. Clearly
if we regard the prediction of each significant figure as an
independent event then the appropriate value to take for
$\varepsilon$ is 0.1, since a false theory would have ten equal
possibilities for filling in each digit. The question of what value
to take for x is somewhat arbitrary. In agreement with Schaffner we
do not follow a purely logical approach and set x = 0. For our
purpose x reflects the scientist's confidence in the new theory T.
One could argue that a scientist would not spend great efforts
developing the consequences of a theory he did not believe in.
By analogy with the situation in the Bayesian analysis of
significance testing (see for example Redhead ([1974])) we could
take x = 1/2. Perhaps more realistically we should take x around
0.01 and adopt the sociological rule that unless a scientist has
a one per cent level of confidence in the truth of his theory he
would not seriously investigate it.⁸ With this choice of
parameters we see that the build-up of confidence in a theory which
makes correct quantitative predictions, as the accuracy of the
experiment increases, would be illustrated by the graph for $p_n$
in the figure.⁹ Of course $p(e_{n+1})$ is only given by our analysis
so long as other factors which could potentially influence the
results are known not to be significant. For a certain value
of n this condition will fail. For example in the case of
the anomalous magnetic moment of the electron the effect of
hadronic couplings introduces theoretical uncertainties which
would ultimately make the prediction of the theory unreliable.¹⁰
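
Since the choice of x is admittedly somewhat arbitrary, it is natural to ask how much it matters. The Python sketch below is our own illustration and rests on the assumptions stated in the text: each agreeing significant figure counts as an independent successful prediction with ε = 0.1, and the posterior is given by (6). The function names and the 0.99 threshold are hypothetical choices made for the example.

```python
def posterior_after_digits(n: int, x: float, eps: float = 0.1) -> float:
    """Posterior of T after n agreeing significant figures, equation (6)."""
    return x / (x + eps**n * (1.0 - x))


def digits_needed(target: float, x: float, eps: float = 0.1) -> int:
    """Smallest n whose posterior reaches target.

    Assumes 0 < x <= 1, 0 < eps < 1 and target < 1, so the loop terminates.
    """
    n = 0
    while posterior_after_digits(n, x, eps) < target:
        n += 1
    return n


for prior in (0.5, 0.01):
    seven = posterior_after_digits(7, prior)
    needed = digits_needed(0.99, prior)
    print(f"x = {prior}: posterior after 7 figures = {seven:.7f}, "
          f"figures needed for 0.99 = {needed}")
```

On these assumptions the two priors differ only in how many figures are needed before confidence becomes high (two figures for x = 1/2, four for x = 0.01); after seven agreeing figures the posterior is within about one part in 10⁵ of unity in either case.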


4. Conclusion


The concept of a novel prediction plays a very important part 

in the way scientists assess their confidence in theories. 

Our analysis has shown how the intuitive appraisal by scientists 
in this respect can be understood if degrees of commitment 

are governed by Bayesian rationality constraints. 


FOOTNOTES 


1.  Zahar [1973] p. 101. In his fiszal Zahar rephrases
    his definition in terms of a notion of empirical
    non-ad hocness expressed as a three-place relation
    between an observation statement, a theory, and a
    heuristic. See also in this connection the very clear
    account of empirical support given by Worrall in
    his [1975]. However this more recent work has somewhat
    obscured the important distinctions drawn by Zahar in
    his [1973].

2.  Zahar [1973] p. 103.

3.  What we have shown is that $\varepsilon = 1$ is a necessary
    condition for T to be an ad hoc₂ explanation of e. To justify
    $\varepsilon = 1$ as a sufficient condition we must invoke a
    principle of insufficient reason, viz., if there is no reason for
    the community of scientists to entertain with non-vanishing
    prior probability only theories which explain e, then they
    will not so constrain their choice of alternative theories.

4.  This result will be recognized as a special case of the
    general treatment given by Keynes in his [1921], pp. 235-6.

5.  The successful detailed quantitative predictions of a
    theory in respect of phenomena quite different from those
    which the theory was originally proposed to deal with have
    always attracted the attention of scientists. To take an
    example at random, in referring to his early work on the
    ground state of Helium Hylleraas comments in his [1963]
    (p. 42) "The end result of my calculation was...greatly
    admired and thought of as almost a proof of the validity
    of wave mechanics...in the strict numerical sense".

6.  Van Dyck et al. ([1977]).

7.  See value quoted in Calmet et al. ([1977]). For a good
    account of the fluctuating agreement between theory and
    experiment the reviews by Lautrup, Peterman and de Raphael
    ([1972]) or Rich and Wesley ([1972]) may be consulted.

8.  We may refer to Shimony's concept of commitment to a
    theory (see his [1970] pp. 94-95). The degree of
    commitment measures the scientist's belief that the
    theory T belongs to the equivalence class of all theories
    which give the same true observational predictions within
    the domain of current experimentation and that the "true"
    theory "generalizes" in some sense the concepts embodied
    in T. Commitment measures our belief, not that a theory
    is true, but that it points the way to the truth.

9.  It is easy to see that our confidence in a theory at
    a given level of accuracy for agreement between theory
    and experiment does not depend on the particular scale
    of notation used to express the result.

10. According to Rich and Wesley ([1972]) the known hadronic
    contribution to the electron anomaly would affect the
    tenth significant figure.